Files having unallocated portions within content addressable storage

ABSTRACT

A request to access a logical location in a file stored in a content addressable storage (CAS) system can be handled by retrieving first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size. Based on the tree data, a second node is selected from a higher level in the hash tree. Second tree data from the second node of the hash tree that represents the file is retrieved, including a second CAS signature. The second CAS signature is determined to match a reserved CAS signature, and in response, an indication that the requested logical location is unallocated within the file is provided.

BACKGROUND

This disclosure relates to data systems that store files with unpopulated or unallocated portions. In particular, it relates to access to files within a system that provides content addressable storage.

Content addressable storage (CAS) allows for data to be stored using identifiers that are generated from the content of the data. This allows for the data to be retrieved using these identifiers without knowledge of a physical address within the storage device. For instance, when a file (or data object) is stored in a CAS system, the CAS system can generate a signature that uniquely identifies the file content, at least in a statistical sense. The CAS system can also specify the storage location for each identifier. This type of address is sometimes referred to as a “content address.”

Two or more data blocks that have identical data content (whether the data blocks are duplicates of one another, or incidentally contain the same data) will result in the same signature being generated for the files. Retrieval of the data content for any of these files will use this common signature. Thus, a single location can store the data for multiple data objects, and a CAS system can reduce the storage space consumed by files, particularly for data backups and archives. CAS systems also facilitate authentication of files. For instance, due to there being only one copy of a file, verifying legitimacy of the file can be simplified.

SUMMARY

Embodiments are directed toward a method for processing a request to access a logical location in a file stored in a content addressable storage (CAS) system. The method includes retrieving first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size. Based on the tree data, a second node is selected from a higher level in the hash tree. Second tree data is retrieved from the second node of the hash tree that represents the file, the second tree data including a second CAS signature. The second CAS signature is determined to match a reserved CAS signature. In response to determining the second CAS signature matches the reserved CAS signature, an indication that the requested logical location is unallocated within the file is provided.

Various embodiments are directed toward a method for generating a content addressable storage (CAS) signature for a file having allocated and unallocated blocks. The method comprises generating a first level of CAS signatures by applying a hash function to allocated blocks of the file to generate CAS signatures, and applying a common CAS signature to each unallocated block. The first level of CAS signatures is encoded with a hash tree depth for the first level of CAS signatures, a block size and a file size. A second level of one or more CAS signatures is generated by: combining the first level of CAS signatures into groups, applying a hash function to groups of combined CAS signatures that are derived from at least one allocated block, applying the common CAS signature to at least one group with unallocated blocks, and encoding the second level of CAS signatures with a hash tree depth for the second level of CAS signatures, the block size and the file size.

Certain embodiments are directed toward a system for generating a content addressable storage (CAS) signature for a file having allocated and unallocated blocks. The system includes a client device configured with a client interface module that is designed to: retrieve first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size; select, based on the tree data, a second node from a higher level in the hash tree; retrieve second tree data from the second node of the hash tree that represents the file, the second tree data including a second CAS signature; determine that the second CAS signature matches a reserved CAS signature; and provide, in response to determining the second CAS signature matches the reserved CAS signature, an indication that the requested logical location is unallocated within the file.

The above summary is not intended to describe each illustrated embodiment or every implementation of the present disclosure.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWINGS

The drawings included in the present application are incorporated into, and form part of, the specification. They illustrate embodiments of the present disclosure and, along with the description, serve to explain the principles of the disclosure. The drawings are only illustrative of certain embodiments of the invention and do not limit the disclosure.

FIG. 1 depicts a CAS storage system, consistent with embodiments of the present disclosure;

FIG. 2 depicts a logical flow diagram for generating a hash tree, consistent with embodiments of the present disclosure;

FIG. 3 depicts logical mapping between a hash tree and data stored in CAS storage, consistent with embodiments of the present disclosure;

FIG. 4 depicts a flow diagram with nodes representing elements useful for generating a hash tree for a file with allocated and unallocated portions, consistent with embodiments of the present disclosure;

FIG. 5 depicts a flow diagram with nodes representing elements useful for traversing a hash tree to obtain data from a file with allocated and unallocated portions, consistent with embodiments of the present disclosure;

FIG. 6 depicts a block diagram of a file or image with multiple versions, consistent with embodiments of the present disclosure;

FIG. 7 shows a flow diagram containing nodes representing elements useful for generating a hash tree for a file with multiple versions, consistent with embodiments of the present disclosure;

FIG. 8 depicts a flow diagram with nodes representing elements useful for traversing a hash tree to obtain data from a file with allocated and unallocated portions, consistent with embodiments of the present disclosure; and

FIG. 9 depicts a block diagram of a computer system for implementing various embodiments.

While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

DETAILED DESCRIPTION

Aspects of the present disclosure relate to content addressable storage; more particular aspects relate to content addressable storage for files with unpopulated portions. While the present invention is not necessarily limited to such applications, various aspects of the invention may be appreciated through a discussion of various examples using this context.

Consistent with various embodiments of the present disclosure, a content addressable storage (CAS) system can be configured to generate a signature (hash) tree for a single file or data object. The hash tree can be constructed from multiple signatures of different portions (blocks) of a single file or data object. Particular embodiments relate to the use of a specified (reserved) signature to represent data blocks that do not contain data that is meaningful (e.g., relative to the use of the file). Retrieval of data objects can be facilitated by traversing the hash tree, while monitoring for the reserved signature.

Certain embodiments are directed toward encoding additional information with signatures contained in the hash tree. This additional information can include information that facilitates traversal of the hash tree from a starting hash node to a final hash (leaf) node that identifies the desired data. For instance, the additional data can contain information such as hash tree depth, file size and block size. Using this information, the requesting device can traverse the hash tree until it encounters a reserved signature or is provided with the requested data block. When encountering a reserved signature, the device can cease the tree traversal and end the corresponding request for data. This can be useful for reducing the amount of data transferred between the requesting device and the CAS system.

Various embodiments are directed toward accessing (storing, modifying or reading) files that can be relatively sparse in their populated content. For instance, disk images can contain data corresponding to a file system of an operating system (e.g., where the disk image is created for backup purposes). The file system can specify that portions of the disk are unallocated file space. While the corresponding physical storage location of the disk can contain data, this data is not required to recreate the system image. For instance, data for a system image can be retrieved from a storage device, such as a hard disk drive. The system may have used/allocated less than half of the available disk drive space. The unallocated portions will return binary data values if accessed; however, the embodiments of the present disclosure build upon the recognition that these values are not relevant for many applications. Accordingly, embodiments are directed toward a CAS system that uses a hash tree to identify unallocated or unpopulated portions of a file.

Embodiments of the present disclosure are directed toward a CAS system that accounts for different versions of a file being stored in the CAS system. When a new version of a file is stored, a created hash tree may contain only minor differences from the older version(s) of the file. Accordingly, embodiments are directed toward categorizing blocks that are unchanged between subsequent versions so that reserved signatures can be placed within the hash tree for corresponding blocks. A requesting device that traverses the hash tree of a particular version of a file may encounter a reserved signature. In response, the requesting device can traverse the hash tree of a previous version of the file or access a locally cached copy of the previous version of the file. The process can be repeated until the data for the requested block is retrieved or the requesting device encounters a reserved signature in the original version of the file, which can signify that the data is unallocated.

Although a collision between a particular reserved signature and actual data in a file can be statistically improbable (with proper selection of the hash function), certain embodiments allow for the use of collision avoidance when generating a hash tree. For instance, the system can monitor the output of a hash function to detect a collision (match) with the reserved signature. If a match is detected, the system can apply an additional algorithm to modify the signature so that it no longer matches the reserved signature. In certain embodiments, the additional algorithm could simply apply a second, predetermined signature that is reserved for collisions; however, a more complex algorithm could also be applied, such as applying a second hash function to the signature. If the system is designed to validate the data, the validation algorithm can also include this collision detection and solution so that it can also (re)generate the proper signature.
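The collision check described above can be illustrated with a minimal sketch. It assumes SHA-256 as the hash function and an all-zero 32-byte reserved signature; the names `RESERVED`, `sha256` and `sign_block` are hypothetical and are not taken from the disclosure. The second hash function is simply SHA-256 applied to the colliding signature, which is one of several options the text permits.

```python
import hashlib

# Assumed for illustration: an all-zero, 32-byte reserved signature.
RESERVED = bytes(32)


def sha256(data: bytes) -> bytes:
    """One-way function used for allocated blocks (assumed choice)."""
    return hashlib.sha256(data).digest()


def sign_block(data: bytes, hash_fn=sha256) -> bytes:
    """Hash an allocated block, remapping any output equal to RESERVED."""
    sig = hash_fn(data)
    if sig == RESERVED:
        # Collision with the reserved signature: apply a second hash
        # function to the signature so it no longer matches.
        sig = sha256(sig)
    return sig
```

A validator that recomputes signatures would need to call the same `sign_block` routine, so that it regenerates the remapped value rather than the raw hash.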

Turning now to the figures, FIG. 1 depicts a CAS storage system, consistent with embodiments of the present disclosure. Clients 102, 104, 106 can be configured to access a CAS data center 116 using CAS interface modules 108, 110, 112. Network 114 can be used to route data between clients and the data center. The network can include, but is not necessarily limited to, one or more local area networks (LANs), one or more wide area networks (WANs), the Internet and combinations thereof. The data center 116 can interface with the clients 102, 104, 106 using a datacenter CAS interface module 118, which can interface with one or more hash tree tables 120 and CAS storage devices 122. In certain embodiments, the clients can cache local copies of a tree table or portions of a tree table.

Consistent with various embodiments, the CAS storage system can contain a number of additional components and be configured in a variety of manners. For instance, one or more computers or servers can be configured to interface with the datacenter and with one or more clients. The computers can be configured to provide functionality such as, but not necessarily limited to, web-based access and portals, backup functions, archival functions, searching capability, system image generation, user authentication, image/file authentication and combinations thereof. For ease of discussion and illustration, FIG. 1 may not expressly show each possible configuration.

Consistent with embodiments, the datacenter CAS interface module 118 can include a hash tree generation module 124, which can be configured to generate a content addressable storage (CAS) signature for a file having allocated and unallocated blocks. For instance, the file can be a system or disk image for a computer system. The hash tree generation module can generate a first level of CAS signatures by applying a one-way (hash) function to allocated blocks. The output of this function can be binary numbers that are used as a set of CAS signatures. A few examples of a one-way function include message digest algorithms (MD5) and secure hash algorithms (SHA-1 or SHA-2). Other functions and hashes are possible. As discussed herein, the one-way functions generate signatures that are statistically unique relative to the data content of a corresponding block. Thus, while collisions between signatures may be theoretically possible, the generated signatures can be considered unique in that the probability of a collision is sufficiently low (e.g., much less likely than data errors from other sources).

The hash tree generation module can use additional data for the file to identify the unallocated blocks. As discussed in more detail herein, the unallocated blocks can include those blocks for which none of the data contained therein is allocated. In place of a hash function, the hash tree generation module can use or apply a common (reserved) CAS signature to each unallocated block. For instance, the common CAS signature could be all binary zeroes. Other binary numbers can be used for the common CAS signature including, for example, a randomly generated CAS signature. While it is theoretically possible that the common CAS signature may result in a collision with a particular allocated data block, the hash function can be selected such that the probability of this is acceptably low.

The hash tree generation module can then encode the first level of CAS signatures with additional information. This additional information can be selected to facilitate traversal of the hash tree that is being generated. For instance, the additional information can include a hash tree depth for the first level of CAS signatures, a block size and a file size.

The hash tree generation module can next generate a second level of one or more CAS signatures by combining the first level of CAS signatures into groups. The groups can include two or more CAS signatures. In some embodiments, the number of CAS signatures that are combined can depend upon the desired tree depth, file size and block size. A hash function is then applied to the groups of combined CAS signatures. In certain embodiments, the hash function can be the same hash function that was applied to the original data blocks. Embodiments allow for different hash functions to be used at different levels. The client CAS interface modules can be provided with information that allows for the proper hash function to be used at each level.

As discussed herein, when a group contains all unallocated blocks (e.g., the group contains only reserved/common signatures), the common CAS signature propagates to the next level. Thus, the hash tree generation module can apply the common signature to the group with the unallocated blocks. The hash tree generation module can then encode the second level of CAS signatures with the additional information (e.g., a hash tree depth for the second level of CAS signatures, a block size and a file size). This process can be iterated to generate additional levels of the hash tree until a single signature is generated for a tree level. The single signature can be referred to as the root signature because a client CAS interface module that has the root signature can access any data block by traversing the hash tree.
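The level-by-level generation and the propagation of the common signature can be sketched as follows. This is an illustrative sketch only: it assumes SHA-256 at every level, an all-zero reserved signature, a fixed group size (`fanout`), and unallocated blocks represented as `None`. The function names are hypothetical, and the header encoding (depth, block size, file size) is omitted for brevity.

```python
import hashlib

RESERVED = bytes(32)  # assumed common/reserved signature (all binary zeroes)


def first_level(blocks):
    """Signatures for data blocks; None marks an unallocated block."""
    return [RESERVED if b is None else hashlib.sha256(b).digest()
            for b in blocks]


def next_level(sigs, fanout=2):
    """Combine signatures into groups and sign each group."""
    out = []
    for i in range(0, len(sigs), fanout):
        group = sigs[i:i + fanout]
        if all(s == RESERVED for s in group):
            # Every member is unallocated: the reserved signature propagates.
            out.append(RESERVED)
        else:
            out.append(hashlib.sha256(b"".join(group)).digest())
    return out


def build_tree(blocks, fanout=2):
    """Return all levels, from the block-signature level to the root."""
    levels = [first_level(blocks)]
    while len(levels[-1]) > 1:
        levels.append(next_level(levels[-1], fanout))
    return levels
```

For a four-block file whose last two blocks are unallocated, `build_tree([b"a", b"b", None, None])` yields three levels, and the second entry of the middle level is the reserved signature.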

Consistent with certain embodiments, the client CAS interface module can be configured to retrieve data from the content addressable storage (CAS) system. The retrieved data can correspond to a request for access to a logical location in a file, which can be received from another device, a program running on the client device or other sources. The client CAS interface module can retrieve data about the hash tree in order to traverse the appropriate hash tree. For instance, the client CAS interface module can retrieve the root signature that represents the appropriate file and decode information stored in a header of the root signature (or otherwise encoded therewith). This information can include tree data such as the hash tree depth, the client CAS signature (or score), the block size for the hash tree and the file size.

The client CAS interface can then use the retrieved information to traverse the hash tree by selecting a second (child) signature and its header from a higher level in the hash tree. For instance, knowing the block size, the tree depth and the file size allows the client CAS interface to determine the path through the hash tree that leads to the desired data block. The client CAS interface can then compare the second CAS signature to the reserved CAS signature, which indicates that all child signatures (from higher levels in the tree) of the second CAS signature are unallocated. If there is a match, the client CAS interface can provide the requesting program or devices with an indication that the requested logical location is unallocated within the file. If there is no match, the client CAS interface can retrieve either a CAS signature from the next level or the requested data (if there are no more levels in the hash tree).
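As a rough illustration of how block size, tree depth and file size can determine a path, the sketch below computes which child to follow at each level for a given byte offset. It assumes a fixed group size (`fanout`) and treats `depth` as the number of levels below the root; both are assumptions made for this example, and `path_to_block` is a hypothetical name.

```python
import math


def path_to_block(offset, block_size, file_size, depth, fanout=2):
    """Child indices to follow from the root to the block holding offset."""
    if not 0 <= offset < file_size:
        raise ValueError("offset lies outside the file")
    n_blocks = math.ceil(file_size / block_size)
    if fanout ** depth < n_blocks:
        raise ValueError("tree depth is too small for this file")
    block = offset // block_size
    # At each step, pick the child whose subtree covers the target block.
    return [(block // fanout ** level) % fanout
            for level in range(depth - 1, -1, -1)]
```

With 4096-byte blocks, a 16384-byte file and a depth of 2, an offset inside the third block gives the path `[1, 0]`: the second child of the root, then the first child of that node.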

Embodiments of the present disclosure are directed toward generating content addressable storage (CAS) signatures for different versions of a file. The data center CAS interface, using a hash tree generation module, can generate a first hash tree for a first version of the file, consistent with the discussions herein. This first hash tree can have a first root CAS signature that is encoded with a depth for the first hash tree, a block size for the first hash tree and a file size for the file. When a new, second version of the file is received, the data center CAS interface can generate a second hash tree for a second version of the file that accounts for commonalities and differences between the file versions. For instance, the data center CAS interface can categorize blocks in the second version of the file as either modified or unmodified, relative to a corresponding block in the first version of the file. A first level of CAS signatures for the second hash tree can include signatures for both modified and unmodified blocks. The signatures for the modified blocks can be generated by applying a hash function to the blocks. The signatures for the unmodified blocks can be generated by applying a common/reserved CAS signature (e.g., all binary zeroes). Additional levels of CAS signatures can be generated in a similar manner until a second root CAS signature is created. Each of the CAS signatures, including the second root CAS signature, can be encoded with a depth for the second hash tree, a block size for the second hash tree and a file size for the file. Accordingly, the “leaves” of the hash tree can each contain a score or signature, a hash tree depth, a block size and a file size.

Certain embodiments are directed toward the use of a reserved signature to denote both (or either) unallocated file blocks and unmodified file blocks. When the client CAS interface encounters the reserved signature, it can traverse back through hash trees of previous versions until it either finds an unreserved signature or reaches the original version (in which case the corresponding block would be unallocated in the requested version of the file and in all previous versions).

Various embodiments allow for the use of two different reserved signatures, one that indicates that the block is unallocated and another that indicates that the block is unmodified. For instance, if a block is both unallocated and unmodified (e.g., it was unallocated in both versions of the file), then the block can be marked as unallocated. When a client CAS interface encounters the signature for unallocated blocks, it can stop traversing the hash tree(s). When a client CAS interface encounters a signature for unmodified blocks, it can begin traversing the hash tree for the previous version of the file and continue accordingly.
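The two-signature variant can be sketched as a simple fallback loop. To keep the example short, the traversal of each version's hash tree is abstracted away: `versions[v][n]` stands for the signature that a traversal for block `n` in version `v` ends on. The two reserved values and the name `resolve` are hypothetical.

```python
UNALLOCATED = bytes(32)        # assumed reserved value: block is unallocated
UNMODIFIED = b"\x01" * 32      # assumed reserved value: same as prior version


def resolve(versions, v, n):
    """Return (version, signature) holding block n, or None if unallocated."""
    while v >= 0:
        sig = versions[v][n]
        if sig == UNALLOCATED:
            return None        # stop traversing: nothing to retrieve
        if sig != UNMODIFIED:
            return (v, sig)    # real data is stored under this signature
        v -= 1                 # unmodified: consult the previous version
    raise ValueError("original version cannot mark a block as unmodified")
```

In a single-signature scheme, the same loop would instead treat a reserved signature in the original version as meaning the block is unallocated.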

FIG. 2 depicts a logical flow diagram for generating a hash tree, consistent with embodiments of the present disclosure. As discussed herein, a file (e.g., a disk image) can contain one or more unallocated portions 202, 208, which are indicated by the shaded portions. These may correspond, for example, to locations on a disk that are marked as free or unused by the corresponding file system(s). The file can be broken into different blocks according to a desired block size. These blocks can be categorized as either allocated (un-shaded) 206, 211 or unallocated (shaded) 204, 210 based upon the portion that they represent. As shown by block 206, a block that contains both allocated and unallocated data can be categorized as allocated.
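The categorization rule can be sketched as follows, assuming the unallocated portions are available as half-open byte ranges that have already been merged (so no two ranges touch or overlap). The function name and the range representation are assumptions made for illustration.

```python
def classify_blocks(file_size, block_size, unallocated):
    """Return one flag per block: True if allocated, False if unallocated.

    unallocated is a list of (start, end) half-open byte ranges, assumed
    to be merged so that adjacent free ranges appear as a single range.
    """
    flags = []
    for start in range(0, file_size, block_size):
        end = min(start + block_size, file_size)
        # A block is unallocated only if it lies wholly inside a free range;
        # a block with mixed content is categorized as allocated.
        covered = any(u0 <= start and end <= u1 for u0, u1 in unallocated)
        flags.append(not covered)
    return flags
```

For a 16-byte file with 4-byte blocks and free ranges `(0, 6)` and `(8, 16)`, the second block straddles allocated and unallocated data and is therefore the only block flagged as allocated.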

Each of the blocks can then be transformed into a signature using signature generation functions. For allocated blocks, the data contents can be input into a signature generator (un-shaded functional block 216) that uses a one-way (hash) function to generate a corresponding signature (un-shaded signatures 222, 226). For unallocated blocks, a signature generator (shaded functional blocks 212, 214, 218) can output a (predetermined) reserved signature (shaded signatures 220, 221, 224). These signatures can be collectively stored as the first, or highest, level of the hash tree.

Two (or more) of these signatures can then be combined to create the next level of the hash tree. For instance, if each of the signatures being combined to make up the next level are unallocated (as shown by shaded signatures 220, 221), then the corresponding signature generator (shaded functional block 228) can output the reserved signature (shaded signature 232). If at least one of the signatures being combined is allocated (as shown by un-shaded signature 226), then the combined signatures can be input into a signature generator (un-shaded functional block 230) that uses a one-way (hash) function to generate a corresponding signature (un-shaded signature 234). This process can be repeated until a single root signature is generated (un-shaded signature 236).

Consistent with embodiments, the signature generators can be configured to encode additional data into a header of each signature (or each node of the hash tree). As discussed herein, the encoded signature can contain the depth within the hash tree, the block size for the hash tree, the file size of the file and the score from the one-way (hash) function.
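One possible encoding of such a node is a fixed-size header followed by the score. The layout below (a 1-byte depth, a 4-byte block size and an 8-byte file size, all big-endian) is purely an assumption for illustration; the disclosure does not specify field widths or ordering.

```python
import struct

# Assumed header layout: depth (uint8), block size (uint32), file size (uint64).
HEADER = struct.Struct(">BIQ")


def encode_node(depth, block_size, file_size, score):
    """Prefix the score (signature bytes) with the traversal header."""
    return HEADER.pack(depth, block_size, file_size) + score


def decode_node(node):
    """Split an encoded node back into its header fields and score."""
    depth, block_size, file_size = HEADER.unpack_from(node)
    return depth, block_size, file_size, node[HEADER.size:]
```

Because the header has a fixed size, a client can decode any node without first knowing the length of the score produced by the hash function.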

FIG. 3 depicts logical mapping between a hash tree and data stored in CAS storage, consistent with embodiments of the present disclosure. Hash tree 300 corresponds to data stored in CAS storage 350. The hash tree includes a root signature 302, which can be used to identify a file or image which contains a plurality of data blocks 316, 318. The root signature 302 is at a first level and is linked with second level signatures 304, 306, which are linked with the third level signatures 308, 310, 312, 314. The third level signatures represent the leaf node level of the hash tree and are linked to corresponding allocated data blocks 316, 318.

Consistent with embodiments, shaded signatures 306, 312, 314 represent signatures that correspond to one or more unallocated data blocks. Accordingly, signatures 306, 312, 314 can contain a common (reserved) signature that is not linked to data stored within CAS storage 350. When a client CAS interface module seeks data blocks corresponding to either of signatures 312 and 314, it can begin with the root signature 302. From there, the client CAS interface module can traverse the hash tree to signature 306. The client CAS interface module can compare the signature 306 to the reserved signature. If a match is found, the client CAS interface module can stop the traversal because the block being sought is unallocated. This can save processing time and data bandwidth, relative to traversing the entire hash tree.

FIG. 4 depicts a flow diagram with nodes representing elements useful for generating a hash tree for a file with allocated and unallocated portions, consistent with embodiments of the present disclosure. A hash (tree) generation module can process a received image file 402. According to some embodiments, the image file can be an image of a storage device, such as a disk drive. As discussed herein, the image file can be processed as data that is divided into a plurality of blocks. To begin, the first block in the image can be set as the current block. The hash generation module can determine whether or not the current block is allocated, per node 404. This determination can include, for instance, obtaining information from a file system directory, metadata and similar sources.

If no portion of the current block is allocated, then the hash generation module can use a reserved signature to represent the block, per node 406. In certain embodiments, this reserved signature can be a predetermined signature bit value, such as all zeroes. It is also possible that the reserved signature is generated by the hash generation module or provided from another source. In instances where the reserved signature is not known ahead of time, the reserved signature can be provided to client CAS interface modules once it is known.

If at least a portion of the current block is allocated, then the hash generation module can apply a one-way function to the data contained in the block to generate a signature for the block, per node 408. Additional information (file size, block size, tree depth) can then be encoded with the signature to generate a corresponding node of the hash tree, per node 410.

The hash generation module can next determine whether or not there are additional blocks in the file or image, per node 412. If there are additional blocks, then the next block can be set to the current block by incrementing the current block, per node 414. If there are no additional blocks, then the hash generation module can advance to the next level of the hash tree by incrementing the current tree level, per node 416.

The leaves of the next tree level can be generated by combining multiple signatures from the prior level and applying a one-way function to the data contained in the combination, per node 418. Additional information (file size, block size, tree depth) can then be encoded with the signature to generate a corresponding node of the hash tree, per node 420.

The hash generation module can next determine whether or not there are additional signatures in the prior tree level, per node 422. If there are additional signatures, then the current signature can be updated to include the next signature(s) by incrementing, per node 423. A new signature for a corresponding node can be generated by repeating nodes 418 and 420. This process can be repeated until all signatures in the prior tree level have been processed and there are no additional signatures in the prior level.

The hash generation module can determine whether or not the root level of the hash tree has been reached, per node 424. If not, then the tree level can be incremented, per node 416, and a new level can be generated for the hash tree. If the root level has been reached, then the hash generation module can end the tree generation, per node 426.

FIG. 5 depicts a flow diagram with nodes representing elements useful for traversing a hash tree to obtain data from a file with allocated and unallocated portions, consistent with embodiments of the present disclosure. When a client device desires one or more portions of a file or image, the client device determines or identifies the corresponding signature for the desired portion, per node 502. For instance, the client device may seek a file that is contained in the Nth block of an image file for a hard disk drive. The client device can then identify and retrieve the root signature for an image corresponding to the particular hard disk drive.

From the identified signature, the client device can begin traversing the hash tree associated with the identified signature, per node 504. For instance, the appropriate signature can be determined from the additional information encoded in the first identified signature. As an example, the client CAS interface module can determine which path in the hash tree corresponds to the desired Nth block from information about the file size, the block size and the tree depth. The CAS system can return the appropriate hash tree signature from the next level.

In certain embodiments, the client CAS interface module can be configured to store a local (cached) copy of hash trees. For example, the client CAS interface module can cache a local copy by storing information obtained from traversing a particular hash tree. Accordingly, the client CAS interface module can first determine whether or not the next signature (along the determined path) is stored locally, per node 506. If so, the client CAS interface module can use the signature from local storage. If not, the client CAS interface module can retrieve the next signature from the CAS system, per node 508. This retrieval can include sending, to the CAS system, a request that identifies the appropriate location for the signature in the next level of the hash tree.

The client CAS interface module can then increment the current signature being analyzed to be the next signature, per node 510. The client CAS interface module can then determine whether or not the current signature is equal to the reserved signature (representing unallocated block(s)), per node 512.

If the current signature is equal to the reserved signature, then the desired Nth block is unallocated and there is no relevant data to be obtained from the block. Accordingly, the flow can end, per node 514. This can be useful for saving processing resources, time and communication bandwidth because the full tree does not necessarily have to be traversed and because the actual data contained within the unallocated block does not need to be transferred. Further savings can be obtained if the relevant portions of the hash tree are cached locally.

If the signature is not equal to the reserved signature, then the client CAS interface module can determine whether or not the current signature is at the top level (and there are no additional hash tree levels to traverse), per node 516. If not, then the client CAS interface module can return to node 504 to further traverse the hash tree.

If the current signature is at the leaf/top level of the hash tree, then the client CAS interface module can retrieve the actual data for the Nth data block, per node 518. The process of retrieving the Nth data block can then end, per node 514.
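The traversal of nodes 504 through 518 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the CAS system is modeled as two in-memory dictionaries (`nodes`, mapping a signature to its list of child signatures, and `blocks`, mapping a signature to block data), the reserved signature is assumed to be 32 zero bytes, and the path is assumed to have been computed beforehand as a list of child positions. All names are hypothetical.

```python
RESERVED = bytes(32)  # assumed encoding of the reserved signature (all zeros)


def read_block(root_sig, path, nodes, blocks, cache):
    """Follow `path` from the root; return block data, or None if unallocated.

    nodes:  signature -> list of child signatures (stands in for the CAS system)
    blocks: signature -> block data (stands in for CAS block storage)
    cache:  (parent signature, child position) -> child signature (local copy)
    """
    current = root_sig
    for child_index in path:                      # node 504: traverse one level
        key = (current, child_index)
        if key in cache:                          # node 506: stored locally?
            next_sig = cache[key]
        else:                                     # node 508: ask the CAS system
            next_sig = nodes[current][child_index]
            cache[key] = next_sig
        current = next_sig                        # node 510: update current signature
        if current == RESERVED:                   # node 512: reserved signature?
            return None                           # node 514: nothing to transfer
    # Node 516: the path is exhausted, so no further levels remain.
    return blocks[current]                        # node 518: retrieve the data
```

Because the loop returns as soon as a reserved signature is seen, a request for a block inside a large unallocated region can stop after a single lookup near the root.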

FIG. 6 depicts a block diagram of a file or image with multiple versions, consistent with embodiments of the present disclosure. Certain embodiments are directed toward the use of reserved hash tree signatures in combination with different versions of the same file. It has been recognized that in addition to, or separate from, unallocated portions of a file, different versions of the same file can often contain significant amounts of unchanged data.

For instance, multiple versions of an image of a hard disk drive can be stored as part of a periodic or continuing backup procedure. The first image file can be denoted as version 1 (V1). V1 might contain unallocated portions 602, 604. When V1 is stored in a CAS system, the resulting hash tree can be constructed to include reserved signatures (606, 608) for blocks containing only data from these unallocated portions.

When a second version (V2) of the file is saved in the CAS system, it may contain sections of data 610, 612 that are identical to the first version. Consistent with embodiments of the present disclosure, the hash tree for V2 can be constructed to include reserved signatures 614, 616 for blocks that do not contain any new data. When a third version (V3) of the file is saved to the CAS system, it may have sections of data 618, 620 that are unchanged relative to V2. The hash tree for V3 can use reserved signatures 622, 624 for blocks that do not contain any new data.

The root signatures for the different versions of the file can be used by a requesting client device to obtain data for the desired version. For instance, a client device may be provided with a root signature for a new version of a file. The client device may already have a local copy of the previous version of the file. When the client device seeks data from the new version of the file, it can use the corresponding root signature to traverse the hash tree of the newer version. When the client device encounters a reserved signature during the traversal, this can indicate that the desired content is the same in both versions of the file. This can be particularly useful for allowing the client device to use a local copy of the older version of the file for portions that are identical in both versions, while still being able to obtain changed portions from the CAS storage.

FIG. 7 shows a flow diagram containing nodes representing elements useful for generating a hash tree for a file with multiple versions, consistent with embodiments of the present disclosure. When a file or image 702 is received for storage in the CAS system, a hash (tree) generation module can determine whether or not the file has one or more prior versions already stored in the CAS system, per node 704. This determination can be made in a number of different ways depending upon the particular application. For instance, the program that provides the image for storage can also provide a root signature for a prior version of the file. In other instances, the CAS system can create a table that includes metadata that helps to define version histories for files.

If the CAS system does not contain a prior version, then a hash tree can be generated, per node 706. In some embodiments, the generation of the hash tree can be carried out without the use of reserved signatures. In other embodiments, the generation of the hash tree can be carried out using reserved signatures for unallocated portions of the file (e.g., proceeding from node A in FIG. 4).

If the CAS system does have a prior version of the file, then the CAS system can obtain a difference report, per node 708. In some embodiments, the difference report can be obtained directly from the received version of the file. For instance, systems that use incremental backups for disk images can be configured to provide a partial data file containing only those portions of the image that changed relative to the prior version. Accordingly, the CAS system can determine which blocks have changed based on whether or not (new) data was provided for a particular group of blocks.

In some instances, only a portion of a block may be changed while the hash tree generation is accomplished on a per block basis based upon both the changed and unchanged data. Consistent with certain embodiments, the system providing the partial file can include data for an entire block if any of the data therein is changed. Various embodiments allow for the CAS system to supplement the partial file by retrieving unchanged data for a block with only a portion of the data being provided by the incremental backup system.
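One way to picture the per-block handling of a difference report is sketched below in Python. The representation of the report as `(offset, length)` byte ranges and both function names are assumptions for illustration only; the disclosure does not prescribe a format for the partial data file.

```python
def changed_block_indices(changed_ranges, block_size):
    """Return the set of block indices touched by any (offset, length) byte range."""
    indices = set()
    for offset, length in changed_ranges:
        if length <= 0:
            continue  # an empty range changes nothing
        first = offset // block_size
        last = (offset + length - 1) // block_size
        indices.update(range(first, last + 1))
    return indices


def patch_block(old_block, block_start, offset, data):
    """Overlay partially changed data on the unchanged contents of one block.

    old_block is the prior version's data for the block that begins at file
    offset block_start; the result is the complete block to be hashed.
    """
    rel = offset - block_start
    if rel < 0 or rel + len(data) > len(old_block):
        raise ValueError("change does not fit inside this block")
    return old_block[:rel] + data + old_block[rel + len(data):]
```

Blocks whose indices are absent from the returned set are the candidates for reserved signatures; blocks that are only partly changed are completed with `patch_block` before hashing.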

Based upon the difference report obtained from node 708, the CAS system can identify those blocks that have not been modified, per node 710. The CAS system can then apply reserved signatures to the identified blocks, per node 712. This can also include the encoding of additional information, consistent with the various discussions herein. The CAS system can then generate the remainder of the hash tree consistent with the various teachings herein. For instance, the hash tree can be generated by continuing at node B in FIG. 4, where the reserved signatures represent unmodified portions and can propagate to higher levels of the tree as discussed herein. In other instances, the hash tree can be generated by continuing at node A in FIG. 4. In such instances, additional reserved signatures can be added for unallocated portions of the file. The reserved signatures can thereby represent blocks of data that are either unmodified (relative to a prior file version), unallocated (relative to a file system) or both.
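The application and propagation of reserved signatures can be sketched in Python as follows. The sketch makes several assumptions that are not taken from the disclosure: SHA-256 stands in for the hash function, the reserved signature is 32 zero bytes, groups have a fixed size (`fanout`), and a group receives the reserved signature only when every member of the group is reserved. The encoding of the tree depth, block size and file size into each signature is omitted for brevity.

```python
import hashlib

RESERVED = bytes(32)  # assumed encoding of the reserved signature (all zeros)


def build_levels(blocks, changed, fanout=2):
    """Build signature levels from the block level (index 0) to the root (last).

    blocks:  list of block contents (bytes)
    changed: set of block indices that carry new, allocated data; every
             other block is given the reserved signature
    """
    level = [
        hashlib.sha256(data).digest() if index in changed else RESERVED
        for index, data in enumerate(blocks)
    ]
    levels = [level]
    while len(level) > 1:
        next_level = []
        for start in range(0, len(level), fanout):
            group = level[start:start + fanout]
            if all(signature == RESERVED for signature in group):
                next_level.append(RESERVED)  # reserved signature propagates
            else:
                next_level.append(hashlib.sha256(b"".join(group)).digest())
        level = next_level
        levels.append(level)
    return levels
```

With four blocks of which only the third changed, the node covering the first two blocks carries the reserved signature, so a later traversal for either of those blocks can stop one level early.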

FIG. 8 depicts a flow diagram with nodes representing elements useful for traversing a hash tree to obtain data from a file with allocated and unallocated portions, consistent with embodiments of the present disclosure. When a client CAS interface module seeks to retrieve a data block from a CAS system, it can first identify the signature for the image containing the desired block, per node 802. This can include, for instance, identifying not only the desired file, but also the desired version of the file. The client CAS interface module may have multiple root signatures for the same file, wherein the root signatures each represent a different version of the file.

The client CAS interface module can then begin traversing the hash tree corresponding to the identified signature, per node 804. As discussed herein, this traversal can include the identification of the next signature from the next level of the hash tree, which might be stored in a local cache of the hash tree. Accordingly, the client CAS interface module can determine whether or not the next signature is stored locally, per node 806. If not, the client CAS interface module can retrieve the next signature from the CAS system, per node 808.

Whether the next signature is obtained from a local cache or the CAS system, the client CAS interface module can update the current signature to be that of the next signature, per node 810. The client CAS interface module can then determine whether the (now updated) current signature is a reserved signature, per node 812. If so, then this indicates that the requested block is either unchanged or unallocated (or both if the reserved signature is used for both unchanged and unallocated blocks). Accordingly, the client CAS interface module can determine whether or not there is a previous or older version of the file or image, per node 814. If there is an older version, then this older version can be set as the current version by identifying and using a signature for the older version, per node 816. The hash tree for this signature can then be traversed per node 804. If there is not an older version then this may indicate that the block is unallocated (or that there is a problem with the hash tree), and the retrieval can end, per node 822.

If the current signature is not a reserved signature, then the client CAS interface module can determine whether or not the current signature is the leaf, or top, level of the hash tree, per node 818. If not, then the client CAS interface module can continue traversing the hash tree, per node 804. If so, then the current signature can be used to retrieve data from the CAS storage system, per node 820. The retrieval process for this block can then end, per node 822.
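The version fallback of nodes 802 through 822 can be sketched in Python as follows. As with any such sketch, the data model is an assumption: `nodes` and `blocks` are in-memory dictionaries standing in for the CAS system, the root signatures are supplied newest first, the reserved signature is 32 zero bytes, and the local cache of nodes 806 and 808 is left out to keep the example short. All names are hypothetical.

```python
RESERVED = bytes(32)  # assumed encoding of the reserved signature (all zeros)


def read_block_versioned(roots_newest_first, path, nodes, blocks):
    """Return block data from the newest version that actually stores it.

    roots_newest_first: root signatures, desired version first, older after
    path:   child position to follow at each level of the hash tree
    nodes:  signature -> list of child signatures
    blocks: signature -> block data
    Returns None when every version marks the block with the reserved signature.
    """
    for root_sig in roots_newest_first:             # nodes 802 and 816
        current = root_sig
        for child_index in path:                    # node 804
            current = nodes[current][child_index]   # nodes 806 to 810, cache omitted
            if current == RESERVED:                 # node 812
                break                               # node 814: try an older version
        else:
            return blocks[current]                  # nodes 818 and 820
    return None                                     # node 822: treated as unallocated
```

The `for ... else` form returns data only when the inner loop finishes without meeting a reserved signature; otherwise control passes to the next older root signature.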

FIG. 9 depicts a block diagram of a computer system for implementing various embodiments. The mechanisms and apparatus of the various embodiments disclosed herein apply equally to any appropriate computing system. The major components of the computer system 900 include one or more processors 902, a memory 904, a terminal interface 912, a storage interface 914, an I/O (Input/Output) device interface 916, and a network interface 918, all of which are communicatively coupled, directly or indirectly, for inter-component communication via a memory bus 906, an I/O bus 908, bus interface unit 909, and an I/O bus interface unit 910.

The computer system 900 may contain one or more general-purpose programmable central processing units (CPUs) 902A and 902B, herein generically referred to as the processor 902. In embodiments, the computer system 900 may contain multiple processors; however, in certain embodiments, the computer system 900 may alternatively be a single CPU system. Each processor 902 executes instructions stored in the memory 904 and may include one or more levels of on-board cache.

In embodiments, the memory 904 may include a random-access semiconductor memory, storage device, and/or storage medium (either volatile or non-volatile) for storing and/or encoding data and programs. In certain embodiments, the memory 904 represents the entire virtual memory of the computer system 900, and may also include the virtual memory of other computer systems coupled to the computer system 900 or connected via a network. The memory 904 can be conceptually viewed as a single monolithic entity, but in other embodiments the memory 904 is a more complex arrangement, such as a hierarchy of caches and other memory devices. For example, memory may exist in multiple levels of caches, and these caches may be further divided by function, so that one cache holds instructions while another holds non-instruction data, which is used by the processor or processors. Memory may be further distributed and associated with different CPUs or sets of CPUs, as is known in any of various so-called non-uniform memory access (NUMA) computer architectures.

The memory 904 may store all or a portion of the various programs, modules and data structures for processing data transfers as discussed herein. For instance, the memory 904 can store a client CAS interface tool or module 950 or datacenter CAS interface tool or module 960. Consistent with certain embodiments, these tools can be implemented as part of one or more database systems. These programs and data structures are illustrated as being included within the memory 904 in the computer system 900, however, in other embodiments, some or all of them may be on different computer systems and may be accessed remotely, e.g., via a network. The computer system 900 may use virtual addressing mechanisms that allow the programs of the computer system 900 to behave as if they only have access to a large, single storage entity instead of access to multiple, smaller storage entities. Thus, while the client CAS interface module 950 and the datacenter CAS interface module 960 are illustrated as being included within the memory 904, these components are not necessarily all completely contained in the same storage device at the same time. Further, although the client CAS interface module 950 and the datacenter CAS interface tool or module 960 are illustrated as being separate entities, in other embodiments some of them, portions of some of them, or all of them may be packaged together with other functions and modules.

In embodiments, the client CAS interface module 950 or the datacenter CAS interface tool or module 960 may include instructions or statements that execute on the processor 902 or instructions or statements that are interpreted by instructions or statements that execute on the processor 902 to carry out the functions as further described below. In certain embodiments, the client CAS interface module 950 or the datacenter CAS interface tool or module 960 can be implemented in hardware via semiconductor devices, chips, logical gates, circuits, circuit cards, and/or other physical hardware devices in lieu of, or in addition to, a processor-based system. In embodiments, the client CAS interface module 950 or datacenter CAS interface tool or module 960 may include data in addition to instructions or statements.

The computer system 900 may include a bus interface unit 909 to handle communications among the processor 902, the memory 904, a display system 924, and the I/O bus interface unit 910. The I/O bus interface unit 910 may be coupled with the I/O bus 908 for transferring data to and from the various I/O units. The I/O bus interface unit 910 communicates with multiple I/O interface units 912, 914, 916, and 918, which are also known as I/O processors (IOPs) or I/O adapters (IOAs), through the I/O bus 908. The display system 924 may include a display controller, a display memory, or both. The display controller may provide video, audio, or both types of data to a display device 926. The display memory may be a dedicated memory for buffering video data. The display system 924 may be coupled with a display device 926, such as a standalone display screen, computer monitor, television, or a tablet or handheld device display. In one embodiment, the display device 926 may include one or more speakers for rendering audio. Alternatively, one or more speakers for rendering audio may be coupled with an I/O interface unit. In alternate embodiments, one or more of the functions provided by the display system 924 may be on board an integrated circuit that also includes the processor 902. In addition, one or more of the functions provided by the bus interface unit 909 may be on board an integrated circuit that also includes the processor 902.

The I/O interface units support communication with a variety of storage and I/O devices. For example, the terminal interface unit 912 supports the attachment of one or more user I/O devices 920, which may include user output devices (such as a video display device, speaker, and/or television set) and user input devices (such as a keyboard, mouse, keypad, touchpad, trackball, buttons, light pen, or other pointing device). A user may manipulate the user input devices using a user interface, in order to provide input data and commands to the user I/O device 920 and the computer system 900, and may receive output data via the user output devices. For example, a user interface may be presented via the user I/O device 920, such as displayed on a display device, played via a speaker, or printed via a printer.

The storage interface 914 supports the attachment of one or more disk drives or direct access storage devices 922 (which are typically rotating magnetic disk drive storage devices, although they could alternatively be other storage devices, including arrays of disk drives configured to appear as a single large storage device to a host computer, or solid-state drives, such as flash memory). In some embodiments, the storage device 922 may be implemented via any type of secondary storage device. The contents of the memory 904, or any portion thereof, may be stored to and retrieved from the storage device 922 as needed. The I/O device interface 916 provides an interface to any of various other I/O devices or devices of other types, such as printers or fax machines. The network interface 918 provides one or more communication paths from the computer system 900 to other digital devices and computer systems; these communication paths may include, e.g., one or more networks 930.

Although the computer system 900 shown in FIG. 9 illustrates a particular bus structure providing a direct communication path among the processors 902, the memory 904, the bus interface 909, the display system 924, and the I/O bus interface unit 910, in alternative embodiments the computer system 900 may include different buses or communication paths, which may be arranged in any of various forms, such as point-to-point links in hierarchical, star or web configurations, multiple hierarchical buses, parallel and redundant paths, or any other appropriate type of configuration. Furthermore, while the I/O bus interface unit 910 and the I/O bus 908 are shown as single respective units, the computer system 900 may, in fact, contain multiple I/O bus interface units 910 and/or multiple I/O buses 908. While multiple I/O interface units are shown, which separate the I/O bus 908 from various communications paths running to the various I/O devices, in other embodiments, some or all of the I/O devices are connected directly to one or more system I/O buses.

In various embodiments, the computer system 900 is a multi-user mainframe computer system, a single-user system, or a server computer or similar device that has little or no direct user interface, but receives requests from other computer systems (clients). In other embodiments, the computer system 900 may be implemented as a desktop computer, portable computer, laptop or notebook computer, tablet computer, pocket computer, telephone, smart phone, or any other suitable type of electronic device.

FIG. 9 depicts a representative set of certain major components of the computer system 900. Individual components, however, may have greater complexity than represented in FIG. 9, components other than or in addition to those shown in FIG. 9 may be present, and the number, type, and configuration of such components may vary. Several particular examples of additional complexity or additional variations are disclosed herein; these are by way of example only and are not necessarily the only such variations. The various program components illustrated in FIG. 9 may be implemented, in various embodiments, in a number of different manners, including using various computer applications, routines, components, programs, objects, modules, data structures, etc., which may be referred to herein as “software,” “computer programs,” or simply “programs.”

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc. or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Although the present disclosure has been described in terms of specific embodiments, it is anticipated that alterations and modifications thereof will become apparent to those skilled in the art. Therefore, it is intended that the following claims be interpreted as covering all such alterations and modifications as fall within the true spirit and scope of the disclosure.

What is claimed is:
 1. A computer-implemented method for transferring a portion of a file stored in a non-transitory content addressable storage (CAS) system to a second non-transitory computer readable storage medium, the computer-implemented method comprising: retrieving, in response to a request for the portion of the file, first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size, wherein the first node represents a set of data blocks corresponding to the portion of the file; selecting, based on the tree data, a second node from a higher level in the hash tree, wherein the second node represents a subset of data blocks represented by the first node; retrieving second tree data from the second node of the hash tree that represents the file, the second tree data including a second CAS signature; determining that the second CAS signature matches a reserved CAS signature, wherein the reserved CAS signature represents one or more unallocated data blocks; and providing, to the second non-transitory computer readable storage medium and in response to the request, and in response to determining the second CAS signature matches the reserved CAS signature, an indication that the file contains one or more unallocated data blocks; and transferring a set of data blocks corresponding to the requested portion of the file from the CAS system and to the second non-transitory computer readable storage medium, wherein the set of data blocks does not include the subset of data blocks corresponding to the second node and encoded with the reserved CAS signature.
 2. The method of claim 1, wherein the file is a disk image that corresponds to a file system with allocated and unallocated portions.
 3. The method of claim 1, wherein selecting a second node from a higher level in the hash tree includes: identifying a data block within the requested portion of the file; and determining, based upon the data block, a path through the hash tree from the first node to the second node, wherein the second node represents one or more data blocks including the identified data block.
 4. The method of claim 1, wherein the reserved CAS signature is a predetermined binary number.
 5. The method of claim 4, wherein the reserved CAS signature is zero.
 6. The method of claim 1, wherein retrieving the first tree data includes sending a request to a CAS datacenter.
 7. The method of claim 6, further wherein the request is sent to a web accessible interface of the CAS datacenter.
 8. A system for generating a content addressable storage (CAS) signature for a file having allocated and unallocated blocks, the system comprising: a client device configured with a client interface module, a non-transitory computer readable storage medium, and a processor that is configured to: generate a first level of CAS signatures by: applying a hash function to allocated blocks of the file to generate CAS signatures, and applying a common CAS signature, wherein the common CAS signature represents one or more unallocated blocks, to each unallocated block; encoding the first level of CAS signatures with a hash tree depth for the first level of CAS signatures, a block size and a file size; and generate a second level of one or more CAS signatures by: combining the first level of CAS signatures into groups, applying a hash function to groups of combined CAS signatures that are derived from at least one allocated block, applying the common CAS signature to at least one group with unallocated blocks, encoding the second level of CAS signatures with a hash tree depth for the second level of CAS signatures, the block size and the file size; and store the first level of CAS signatures and the second level of one or more CAS signatures in the non-transitory computer readable storage medium; wherein the processor is configured to, responsive to a request from the client interface module to restore the file, output a set of blocks corresponding to the file; and wherein the set of blocks does not contain any block encoded with the common CAS signature.
 9. The system of claim 8, wherein the file is a disk image that corresponds to a file system with allocated and unallocated portions.
 10. The system of claim 9, further comprising determining whether the blocks of the file are allocated or unallocated based upon directory information corresponding to the file system.
 11. The system of claim 8, wherein the common CAS signature is a predetermined bit value.
 12. The system of claim 8, further comprising generating a root CAS signature for the hash tree.
 13. The system of claim 8, wherein the common CAS signature is a predetermined binary number.
 14. The system of claim 13, wherein the common CAS signature is zero.
 15. The system of claim 8, wherein the client device comprises at least one computer of a CAS datacenter.
 16. A computer program product for loading a file from a content addressable storage (CAS) system having allocated and unallocated blocks, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: retrieve, responsive to a request for the file, first tree data from a first node in a hash tree that represents the file, the first tree data including a first hash tree depth, a first CAS signature, a block size and a file size; retrieve respective tree data from respective higher nodes of the hash tree that represents the file, the respective tree data including respective CAS signatures; determine that at least one respective CAS signature corresponding to a respective node matches a reserved CAS signature, wherein the reserved CAS signature represents one or more unallocated blocks; and output, in response to retrieving respective tree data, a set of blocks corresponding to a set of nodes in the hash tree that represents the file, wherein the set of blocks does not include respective blocks corresponding to respective nodes having a CAS signature matching the reserved CAS signature.
 17. The computer program product of claim 16, wherein the program instructions are further configured to cause the processor to: receive a file to store; generate a first level of CAS signatures representing respective blocks of the file by: applying a hash function to allocated blocks of the file to generate CAS signatures, and applying a reserved CAS signature to each unallocated block; encode the first level of CAS signatures with a hash tree depth for the first level of CAS signatures, the block size and the file size; generate a second level of one or more CAS signatures by: combining the first level of CAS signatures into groups, applying a hash function to groups of combined CAS signatures that are derived from at least one allocated block, and applying the reserved CAS signature to at least one group with unallocated blocks; and encode the second level of CAS signatures with a hash tree depth for the second level of CAS signatures, the block size and the file size; and store the first level of CAS signatures and the second level of CAS signatures.
 18. The computer program product of claim 17, wherein the program instructions are further configured to cause the processor to: generate additional levels of one or more CAS signatures until a root CAS signature is generated, wherein the root CAS signature comprises a CAS signature representing the file; and store the root CAS signature and all respective CAS signatures of all respective levels.
 19. The computer program product of claim 16, wherein the program instructions configured to cause the processor to retrieve respective tree data from respective higher nodes of the hash tree are further configured to cause the processor to: in response to determining that a respective CAS signature of a respective node matches the reserved CAS signature, select a different respective node higher than the first node and not associated with a previously selected node having a CAS signature matching the reserved CAS signature; and in response to determining that the CAS signature of a respective node does not match the reserved CAS signature, and in response to determining that the respective node is a top level node, retrieve a block associated with the respective node.
 20. A computer program product for generating a content addressable storage (CAS) signature for a file having allocated and unallocated blocks, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processor to cause the processor to: generate a first level of CAS signatures by: applying a hash function to allocated blocks of the file to generate CAS signatures, and applying a common CAS signature to each unallocated block, wherein the common CAS signature represents one or more unallocated blocks; encode the first level of CAS signatures with a hash tree depth for the first level of CAS signatures, a block size and a file size; generate a second level of one or more CAS signatures by: combining the first level of CAS signatures into groups, applying a hash function to groups of combined CAS signatures that are derived from at least one allocated block, and applying the common CAS signature to at least one group with unallocated blocks; encode the second level of CAS signatures with a hash tree depth for the second level of CAS signatures, the block size and the file size; and store the first level of CAS signatures and the second level of one or more CAS signatures; wherein the program instructions are configured to cause the processor to, responsive to a request, output a set of blocks corresponding to the file; and wherein the set of blocks does not contain any block encoded with the common CAS signature.
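By way of non-limiting illustration, the signature generation of claims 17, 18 and 20 and the block loading of claim 16 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `RESERVED`, `Node`, `sign`, `store_file` and `load_blocks` are hypothetical, SHA-256 stands in for an unspecified hash function, an all-zero value stands in for the reserved (common) CAS signature, a Python dictionary stands in for the CAS system, depth is counted from the block level upward, and a group receives the reserved signature only when every member of the group is unallocated.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Assumption: an all-zero value serves as the reserved (common) CAS signature.
RESERVED = b"\x00" * 32


@dataclass(frozen=True)
class Node:
    depth: int                       # 0 for data blocks, increasing toward the root
    block_size: int
    file_size: int
    children: Tuple[bytes, ...] = ()  # child signatures (interior nodes only)
    data: bytes = b""                 # block content (depth 0 only)


def sign(payload: bytes, depth: int, block_size: int, file_size: int) -> bytes:
    """Hash the payload together with the encoded depth, block size and file size."""
    header = f"{depth}:{block_size}:{file_size}:".encode()
    return hashlib.sha256(header + payload).digest()


def store_file(blocks: List[Optional[bytes]], block_size: int, file_size: int,
               fanout: int, store: Dict[bytes, Node]) -> bytes:
    """Build the hash tree bottom-up and return the root signature.

    `blocks` must be non-empty; None marks an unallocated block.
    """
    level: List[bytes] = []
    for block in blocks:
        if block is None:
            level.append(RESERVED)            # nothing is stored for a hole
        else:
            sig = sign(block, 0, block_size, file_size)
            store[sig] = Node(0, block_size, file_size, data=block)
            level.append(sig)
    depth = 0
    while len(level) > 1:                     # add levels until one root remains
        depth += 1
        next_level: List[bytes] = []
        for i in range(0, len(level), fanout):
            group = tuple(level[i:i + fanout])
            if all(s == RESERVED for s in group):
                next_level.append(RESERVED)   # entirely unallocated group
            else:
                sig = sign(b"".join(group), depth, block_size, file_size)
                store[sig] = Node(depth, block_size, file_size, children=group)
                next_level.append(sig)
        level = next_level
    return level[0]


def load_blocks(root: bytes, fanout: int, store: Dict[bytes, Node]) -> Dict[int, bytes]:
    """Return {block index: data}, skipping every subtree with the reserved signature."""
    out: Dict[int, bytes] = {}

    def walk(sig: bytes, first_index: int) -> None:
        if sig == RESERVED:
            return                            # unallocated: no lookup, no output
        node = store[sig]
        if node.depth == 0:
            out[first_index] = node.data
            return
        span = fanout ** (node.depth - 1)     # blocks covered by each child
        for i, child in enumerate(node.children):
            walk(child, first_index + i * span)

    walk(root, 0)
    return out
```

In this sketch, a file whose blocks are all unallocated reduces to the reserved signature at the root, so loading it touches no stored node, and a large hole inside an otherwise populated file costs one comparison at the highest node that covers it.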